-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Combining filter rewrite and skip list to optimize sub aggregation #19573
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ing sub aggregation Signed-off-by: Ankit Jain <[email protected]>
Signed-off-by: Ankit Jain <[email protected]>
Signed-off-by: Ankit Jain <[email protected]>
❌ Gradle check result for 82bc95d: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Ankit Jain <[email protected]>
❌ Gradle check result for aff3dc6: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Ankit Jain <[email protected]>
I think big5 snapshot for 3.4 should have skiplist enabled for @timestamp. I tried out this change on big5, 20% data, sorted by @timestamp (default is no sort). [ec2-user@ip-172-31-61-197 ~]$ opensearch-benchmark compare -b 75193634-a102-4dc0-bff7-8e77af36a358 -c bb6304f5-89de-4bcb-a4a9-df2d2d3e5a20 / __ ____ ___ ____ / / ____ / / / __ ) ____ / / ____ ___ ____ / / Comparing baseline with contender
|
So AutoDateHistogram is different than DateHistogram, will need to add skiplist logic there as well: ![]() range_auto_date_histo_baseline.html |
Fixed the issue with my setup, it wasn't hitting the skiplist path. Now with skiplist its 20-30% worst latency. Async profile attached.
range_auto_date_histo_baseline.html |
Signed-off-by: Ankit Jain <[email protected]>
Thanks @asimmahmood1 for sharing these flame graphs. I am kind of surprised to not see
We should also verify the correctness of results by comparing the output. |
// TODO: SkipListLeafCollector should be used if the getLeafCollector invocation is from | ||
// filterRewriteOptimizationContext when parent != null. Removing the check to collect | ||
// performance numbers for now | ||
return new HistogramSkiplistLeafCollector(singleton, skipper, preparedRounding, getBucketOrds(), sub, this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it turns out this preparedRound is too granular, so upToSameBucket is always false,
i.e. within the same skipList block (4k) the min != max.
Auto date histogram does some estimation of rounding, based on some starting interval, so we'll need to same within Skiplist version.
For now the game plan is:
- Starting with auto date at the top level, find a reasonable rounding (see
getRounding()
method) - Then try out for sub agg, where for each top level bucket ord we'll need to cal a new rounding. One which is to use the top level skiplist to find maxDocId and use its value as max.
Since auto date histo needs additional logic for prepare rounding, if just use data histo as sub agg, it still provides a 10 improvement, from 890ms to 88ms!
|
Signed-off-by: Ankit Jain <[email protected]>
Signed-off-by: Ankit Jain <[email protected]>
Signed-off-by: Ankit Jain <[email protected]>
Signed-off-by: Ankit Jain <[email protected]>
Signed-off-by: Ankit Jain <[email protected]>
❌ Gradle check result for 65192fd: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Ankit Jain <[email protected]>
Signed-off-by: Ankit Jain <[email protected]>
❌ Gradle check result for 866937d: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Ankit Jain <[email protected]>
Let me check if its functionally correct:
response
|
❌ Gradle check result for 2d697fc: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
With metricsThis is with 1% index size so numbersa not comparable to big5's benchmark, plus this is date histogram (with day interval),
|
Signed-off-by: Ankit Jain <[email protected]>
Signed-off-by: Ankit Jain <[email protected]>
Signed-off-by: Ankit Jain <[email protected]>
❌ Gradle check result for 729bf0a: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Description
This change aims at optimizing the sub aggregation by leveraging multi range traversal for top level aggregation, and skip list for the sub aggregation.
I have copied over the
BitSetDocIdStream
class in OpenSearch from Lucene for now as it is not public, but should look at eventually getting rid of it.Related Issues
Related to #17447, #19384
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.